High-throughput determination of RNA tertiary contact thermodynamics by quantitative DMS chemical mapping

Abstract Structured RNAs often contain long-range tertiary contacts that are critical to their function. Despite the importance of tertiary contacts, methods to measure their thermodynamics are low throughput or require specialized instruments. Here, we introduce a new quantitative chemical mapping method (qMaPseq) to measure Mg2+-induced formation of tertiary contact thermodynamics in a high-throughput manner using standard biochemistry equipment. With qMaPseq, we measured the ΔG of 98 unique tetraloop/tetraloop receptor (TL/TLR) variants in a one-pot reaction. These results agree well with measurements from specialized instruments (R2= 0.64). Furthermore, the DMS reactivity of the TL directly correlates to the stability of the contact (R2= 0.68), the first direct evidence that a single DMS reactivity measurement reports on thermodynamics. Combined with structure prediction, DMS reactivity allowed the development of experimentally accurate 3D models of TLR mutants. These results demonstrate that qMaPseq is broadly accessible, high-throughput and directly links DMS reactivity to thermodynamics.


Introduction
Structured RNAs play critical roles in cellular functions, are foundational to the life cycle of RNA viruses, and serve as blueprints for new artificial machines (4)(5)(6)(7)(8)(9).RNAs must fold into intricate 3D structures to perform these diverse functions.This folding typically occurs in two distinct steps.Initially, nucleotide base pairs form, creating a secondary structure with helices, junctions and loops.Subsequently, these secondary structures may engage in further long-range interactions -known as tertiary contacts-which exhibit a remarkable structure and stability range.For instance, some tertiary contacts are transient, involving single adenines docking into a helix's minor groove (a-minor interactions) ( 10 ).In contrast, others encompass multiple base pairs or non-canonical interactions that provide enough thermodynamic stability to strongly favor a specific 3D conformation crucial for its func-tion ( 11 ).Understanding the thermodynamics of these individual tertiary contacts is vital for unraveling how biological RNAs fold into their functional forms.Moreover, this knowledge will be instrumental in designing advanced artificial RNAs, opening new frontiers in biotechnology and medicine.
Due to the importance of tertiary contacts, several methods have been utilized to investigate their stability experimentally.Single-molecule FRET, native gel electrophoresis, small angle X-ray scattering, chemical probing, and others have all been used to assess tertiary contact stability quantitatively ( 4 ,12-21 ).However, these time-intensive protocols require individual sample preparation, making them impractical for large-scale RNA studies.As the number of structured RNAs with potential tertiary contacts rapidly grows, there is a critical need for high-throughput methods.Greenleaf et al. have recently developed the RNA on a massively parallel array (RNA-MaP) platform ( 1 , 22 , 23 ).This device can simultaneously measure the stability of thousands of unique RNA tertiary contacts through fluorescence intensity.The RNA-MaP platform is a significant technological advance, leading to several important insights about tertiary contact formation ( 1 , 22 , 23 ).However, these devices are custom-built using existing Illumina sequencers and, thus, are not accessible to other scientists.Consequently, the scientific community still faces an urgent need for a simple, high-throughput method that is widely accessible.
Chemical modifiers, such as dimethyl sulfate (DMS), offer an alternative for measuring tertiary contact formation in RNAs.DMS methylates the N1 position of adenine and the N3 position of cytosine when accessible to solvent.These DMS reactivity profiles are traditionally employed to constrain RNA secondary structures, as the targeted nitrogen in Watson-Crick pairs is shielded from the DMS-induced methylation (24)(25)(26)(27)(28).However, several tertiary contacts, including kissing loops, pseudoknots, A-minor interactions, and the tetraloop / tetraloop receptor (TL / TLR), also protect against DMS methylation ( 4 ,29-32 ).Recently, DMS has been coupled with next-generation sequencing techniques such as mutational profiling with sequencing (DMS-MaPseq) (26)(27)(28).DMS-MaPseq allows for simple multiplexing of the DMS reaction to probe the DMS reactivity of thousands of RNAs in a one-pot reaction.DMS-MaPseq has led to multiple advances in understanding conformational changes in secondary structure but has yet to be used to measure 3D structure thermodynamics.
In physiological environments, the formation of many tertiary RNA contacts relies on divalent cations, particularly Mg 2+ , to mitigate the electrostatic repulsion of the phosphate backbone.Decades of research have demonstrated that the concentration of Mg 2+ required for a tertiary contact to form is directly related to its thermodynamic stability (33)(34)(35).The concentration of Mg 2+ required for 50% of the RNA molecules to form their tertiary contact is known as the [Mg 2+ ] 1 2 .A low [Mg 2+ ] 1 2 requirement suggests a more stable interaction capable of overcoming inherent electrostatic repulsion.Furthermore, using older reverse transcription stop methods, DMS chemical mapping has been employed to measure [Mg 2+ ] 1 2 for individual RNA constructs ( 4 ).We posited that DMS-MaPseq might allow simultaneous measurement of [Mg 2+ ] 1 2 for hundreds of RNAs in one-pot reactions.This new method would bring high-quality thermodynamic measurements to a much more extensive range of researchers.
This study introduces a new quantitative mutational profiling method (qMaPseq) that integrates [Mg 2+ ] titrations to measure the thermostability of tertiary contacts.qMaPseq can be performed on 1 to over 10 000 RNAs in a one-pot reaction, provides nucleotide-resolution data, is low cost, and uses common reagents found in any molecular biology lab.To benchmark qMaPseq, we used an RNA nanostructure containing a tetraloop / tetraloop receptor (TL / TLR) tertiary contact.We measured the thermostability of 98 TLR mutants and obtained results that were consistent with the specialized RNA-MaP platform.Surprisingly, the DMS reactivity of the TL directly correlates to the stability of the contact ( R 2 = 0.68), the first direct evidence that DMS reactivity reports on thermodynamics.qMaPseq data yields rich nucleotide-level data that contains embedded information about 3D structures.These data permitted observation of an Mg 2+ -induced conformational change in the TLR that explains discrepancies between the free TLR and TLR-bound structures.Lastly, we used these data to build preliminary 3D models of TLR mutant structures.Our findings indicate that qMaPseq can democratize and generalize high-quality thermodynamic methods of tertiary contacts that can be applied to a wide range of systems.

Designing TLR mutant libraries
Using the supplemental data from Bonilla et al. ( 1 ), we selected a test set from that study's original 1592 TLR constructs.First, we only considered mutants that fell into class A (277), which had a thermodynamic fingerprint similar to the wild-type.We reasoned that we only wanted to consider mutant binding modes similar to the wild-type for this first proof-of-concept study.We further filtered out constructs with few A and C nucleotides, removing all with less than three total A and C nucleotides.Lastly, we removed constructs that did not contain two Watson-Crick pairs flanking the tetraloop receptor.This left us with 98 mutants.We broke these into four sets that either had 25 or 24 constructs.We randomly selected helical sequences to ensure distinguishability among these mutants, as depicted in Supplementary Figure S22 .Initial studies indicate that changing the helical sequence does not alter the reactivity of the tetraloop or receptor.Furthermore, we extended the first helix by seven base pairs to include a unique barcode preceding the tetraloop receptor, achieving a hamming distance of 20 or greater between each sequence.

Generating DNA single templates from primer assembly
The sequence of each primer is listed in the Supplemental Document 'Primers.xlsx.'Additionally, the primers for each DNA template are summarized under the primer assembly tab.First, the middle primers (P2-P5 for miniTTR 6 wildtype and P2-P3 for all other templates) were combined and diluted to 1 μM concentration in 100 μl solution.Then, the PCR reaction was mixed by combining 25 μl of Platinum II Hot-Start PCR master mix (ThermoFisher #14000012), two μl of the diluted middle primer solution, two μl of the corresponding forward and reverse primers (P1 and P6 for miniTTR 6 wildtype and P1 and P4 for all other templates) at 100 μM concentration, and 19 μl of RNase-free UltraPure water (ThermoFisher #10977015).The reaction was run in the thermocycler with an initial denaturation for 2 min at 94 • C, followed by 30 cycles of denaturation, annealing, and extension steps done for 30 s each at 94 • C, 60 • C, and 72 • C, respectively .Lastly , a final extension of 5 min at 72 • C was performed.The PCR products were then run on a 2% agarose gel for 1 h at 150 V, and bands of the correct size were extracted and purified using a Zymoclean Gel DNA Recovery Kit (Genesee Scientific #11-301C).

Generating pooled DNA templates from oligonucleotide pools
The four miniTTR TLR mutant libraries were generated from single-stranded oligonucleotide 'Opools' from Integrated DNA Technologies (IDT).The oligo pools are delivered dry and solvated using 50 μl of IDTE pH 8.0 (IDT #11-05-01-13).All libraries were amplified using the same forward and reverse primers (forward: TTCT AA T A CGA CTCA CT A T AGG, reverse: GTTGTTGTTGTTGTTTCTTT).The primers were also ordered from IDT but were delivered at 100 μM concentration in IDTE pH 8.0 solution.Before setting up the PCR reaction, primers were diluted to 10 μM concentration.The PCR reaction was mixed by combining 25 μl of Q5 High-Fidelity DNA Polymerase (New England Biolabs #M0494S), 2 μl of the oligo pool solution, 2.5 μl of the diluted forward and reverse primers, and 18 μl of RNase-free UltraPure water (ThermoFisher #10977015).The reaction was run in a thermocycler for 20 cycles.Denaturation, annealing, and extension were done at 98 • C, 62 • C and 72 • C, respectively.The denaturation step was done for 10 s, while annealing and extension were done for 15 s.An initial denaturation for 30 s at 98 • C and a final extension for 5 min at 72 • C were also done.The PCR products were then run on a 2% agarose gel for 1 h at 150 V, and bands of the correct size were extracted and purified using a Zymoclean Gel DNA Recovery Kit (Genesee Scientific #11-301C).

In vitro transcription of RNA constructs
All constructs were transcribed in vitro from DNA into RNA.Before setting up the reaction, the following reagents were prepared: 10x Transcription (Tx) Buffer (400 mM tris-base, 10 mM spermidine, 0.1% Triton X), 25 mM NTP solution, 50 mM DTT and 250 mM MgCl 2 .Transcription reaction was prepared by combining 10 μl of 10 × Tx Buffer, 5 μl of 50 mM DTT, 16 μl of 25 mM NTP solution, 8 μl of 250 mM MgCl 2 , 4 μl of t7 polymerase (New England Biolabs #M0251S), 33 μl RNase-Free water, and 24 μl of purified DNA.The concentration of purified DNA was measured via nanodrop and adjusted to 0.3 μM before being added to the transcription reaction.Finally, the reaction mixture was incubated in the thermocycler at 37 • C for 6 h.After the reaction, the DNA template was digested using DNase I, and it was included in the RNA Clean and Concentrator-5 with the DNase I kit (Genesee Scientific #R1014).The RNA was then purified using the same kit.Before proceeding to the DMS MaPseq experiment, the concentration of RNA was measured through its absorbance value on the nanodrop.Additionally, we verified RNA length by running the sample through a 4% agarose denaturing gel for 1 h at 150 V.

DMS-MaPseq protocol
DMS-MaPseq was done on all purified RNA constructs.The specific conditions for RNA folding and RNA modification with DMS varied accordingly for the various buffer titrations and magnesium titrations that were done.First, we measured 10 pmol of RNA in 5 μl of RNase-Free water for each reaction condition.RNA was denatured in the thermocycler at 90 • C for 4 min and immediately cooled for 3 min at 4 • C. Then the RNA was pipetted into a folding solution containing 16.5 μl of buffer (at reach the desired concentration) and 1 μl of MgCl 2 at the desired concentrations.For example, for the condition specifying 50 mM sodium cacodylate and 10 mM Mg 2+ , 16.5 μl of 76 mM sodium cacodylate was mixed with 1 μl of 250 mM MgCl 2 to prepare the reach a final concentration of 50 mM sodium cacodylate and 10 mM MgCl 2 .All conditions were done similarly to prepare a final reaction volume of 25 ul.The RNA was allowed to fold in solution for 30 min at room temperature.During this time, a DMS solution was prepared in the fume hood.This was done by adding 15 μl of DMS (Sigma-Aldrich #D186309) to 85 μl of 100% ethanol (Decon Labs cat.#2716).Once RNA was folded for 30 min, 2.5 μl of DMS solution was added to the RNA buffer solution to modify the RNA.The reaction was quenched 6 min after DMS with 25 μl of BME (ThermoFisher cat.#125470010) to the solution.Then, the modified RNA was purified using the RNA Clean & Concentrator-5 kit (Genesee Scientific #R1014).The RNA sample was eluted from the column with 7 μl of RNase-Free water.The concentration of RNA was measured using the Qubit RNA BR Assay Kit (ThermoFisher #Q10211).1 μl of RNA sample was used to measure the concentration of modified RNA.
To read out the methylations of A and C, we utilized TGIRT III reverse transcriptase, which incorporates mutations of the cDNA strand during reverse transcription.Before setting up the reverse transcription reaction, the following solutions were prepared: 5 × TGIRT buffer (250 mM Tris-HCl pH 8.3, 375 mM KCl, 15 mM MgCl 2 ), 10 mM dNTPs, 100 mM DTT, 0.4 M NaOH, and Quench Acid (1.43 M NaCl, 0.57 M HCl, 1.29 M sodium acetate).Additionally, the modified RNA sample was diluted to a concentration of 0.25 μM and one RTB primer to introduce a barcode for demultiplexing (Supplemental Document Primers.xslx) to 0.285 μM.Finally, the reverse transcription reaction was mixed by combining 2.4 μl of 5 × TGIRT buffer, 1.2 μl of 10 mM dNTP solution, 0.6 μl of 100 mM DTT, 0.5 μl of TIGIRT-III enzyme, 6.4 μl of 0.25 μM modified RNA, and 1 μl of 0.285 μM RTB primer.The reaction was mixed thoroughly and then incubated at 57C for 2 h.After incubation, 5 μl of 0.4 M NaOH was added to the reaction products.The sample was heated in the thermocycler at 90 • C for 4 min and then cooled immediately for 3 min at 4 • C.Then, 2.5 μl of quench acid was added to neutralize the NaOH.Note that the exact volume of quench acid required to neutralize the NaOH will vary slightly for different batches.Next, the cDNA product was purified using the Oligo Clean and Concentrator Kit (Genesee Scientific #11-380B).Before starting the purification, 30 μl of RNase-Free water was added to the reaction products to bring the volume to 50 μl.The cDNA was eluted from the column using 15 μl of RNase-free water.
Then, the cDNA was amplified with PCR, similar to how the original DNA templates were amplified from an oligo pool.The PCR used forward and reverse primers ordered from IDT dissolved in IDTE pH 8 buffer at a concentration of 100 μM.All constructs were amplified with the same FWD primer AATGATA CGGCGA CC ACCGAGATCTAC A CTCTTTCCCTA CA CGCGCTCTTCCG.The minittr-6 was amplified with reverse primer C AAGC AGAAGACGGC A T ACGAGA TCGGTCTCGGCA TTCCTGCTGAACCGCTC TTCCGATCTGGTACT ATGT ACCAAAG while all others were amplified with reverse primer C AAGC AGAAGACGG CAT ACGAGA TCGGTCTCGGCA TTCCTGCTGAACCGC TCTTCCGATCTGGAA CA GC ACTTCGGTGC AAA.These primers were diluted in a 1:10 dilution to a concentration of 10 uM before mixing the reaction.To mix the reaction, 25 μl of Q5 High-Fidelity DNA Polymerase (New England Biolabs #M0494S), 2.5 μl of diluted forward and reverse primer, 2.0 μl of purified cDNA, and 18 μl of RNase-Free water were combined.The reaction was run in a thermocycler for 16 cycles.Denaturation, annealing, and extension were done at 98 • C, 62 • C and 72 • C, respectively.The denaturation step was done for 10 s, while annealing and extension were done for 15 s.An initial denaturation for 30 s at 98 • C and a final extension for 5 min at 72 • C were also done.PCR products were gel purified using the E-Gel Power Snap Plus Electrophoresis System (ThermoFisher #G9301).5 μl of PCR products was run on a 2% E-gel EX Agarose Gel (ThermoFisher #G401002) for 10 min according to the settings programmed into the gel system.Then, the gel was imaged, and bands of the correct size were excised carefully using a metal spatula.The DNA was then purified using the Zymoclean Gel DNA Recovery Kit (Genesee Scientific #11-301C).Lastly, the purified DNA concentration was measured using the Qubit 1X dsDNA High Sensitivity Assay Kit (ThermoFisher #Q33230).
novobarcode -b rtb_barcodes.fa-f test_R1_001.fastqtest_R2_001.fastqwhere rtb_barcodes.fagives a list of sequencing barcodes, such as Where three barcodes are specified, distance is the distance in base pairs between a barcode and an allowable read.Format specifies that the barcode will be on the 5 end of the read 1.For the pools that contained TLR mutants, we performed an additional second demultiplexing on the helix under the TLR.These seven base pair helices are unique to each construct in the pool.This second step ensured there were no missed alignments, as even a few percent missed alignments of reads could add additional noise.Demultiplexing of internal helices was performed by custom software developed in-house ( https:// github.com/jyesselm/ barcode _ demultiplex ).
barcode_demultiplex -csv constructs.csv-fq1 test_R2_001.fastq-fq2 test_R1_001.fastq-helix 1 0 7 -m 1 constructs.csv is a comma-delimited formatted file containing the name, sequence, and structure in dot-bracket format of each RNA in a pool.The -fq1 and -fq2 flags take F ASTQ for -matted sequencing output files, with fastq1 as the file that contains the RNA sequence from 5 to 3 .-helix specifies which helix to use in occurrence from the 5 end.-helix 1 0 7 specifies using the second helix (note this is zero-indexed) and uses base pairs 0 to 7. Lastly, -m 1 indicates accepting sequences that are at most hamming distance of 1 away from the target barcode sequence.This command uses SeqKit ( 3 ) to find the occurrence of a given barcode in the supplied FASTQ files in the approximate position it should exist.It will ensure the barcode sequence exists in both the forward and reverse reads if possible.barcode_demultiplex outputs reads into separated paired FASTQ formatted files containing only reads with a given barcode identified.
Lastly, we used the RNA mutational profiling (RNA-MaP) tool to count mutations to determine the mutational fraction at each nucleotide position ( https://github.com/YesselmanLab/rna _ map ).This is an open-source tool developed to simplify mutational profiling analysis.
rna-map -fa test.fasta-fq1 test_mate1.fastq-fq2 test_mate2.fastq-dot-bracket test.csv-param-preset barcoded-libraries The rna-map command requires a FASTA-formatted file containing all DNA reference sequences and the paired sequencing reads generated from the previous step.We also supplied a CSV file containing each RNA's dot bracket structure with the -dot-bracket flag and applied stricter constraints to how well each sequence needs to align to a read -param-preset barcoded-libraries.All the data are stored in JSON format in a pandas DataFrame. of each construct, we took the 16-point [Mg 2+ ] titration data and extracted the DMS reactivity for the three adenines in the GAAA tetraloop ( average GAAA DMS ) .The first 25 construct library 5 mM [Mg 2+ ] points had unexpectedly high reactivity, so they were removed.To estimate [Mg 2+ ] 1 2 , we computed the relative protection values of each Mg 2+ concentration i, which were calculated below.

Computing [Mg
where f i is the relative DMS protection that a given Mg 2+ concentration imparts compared to the 0 Mg 2+ concentration condition, this number ranges from 0 (no protection) to 1 (completely protected).We fit these values using a modified Hill equation as used in a previous study ( 4 ).
) n where [ Mg 2+ ] i is the Mg 2+ concentration at a given measurement, A is the scaling factor, and n is the Hill coefficient.These fits were performed with SciPy's curve_fit function for each data set.One hundred bootstrap steps were performed to estimate the error in the fit.
To compute the G of a given tertiary contact formation can be calculated using the following equation.
R is the gas constant, and T is the temperature at 20

Computing
G can be done by subtracting any computed G from the G wt , which is the G of the wild type.

Measuring pH change of sodium cacodylate buffers during DMS reaction
Purified RNA was used to test the pH change during RNA modification.100 pmol of RNA in 50 μl of RNase-Free water was denatured in the thermocycler at 90 • C for 4 min and immediately cooled for 3 min at 4 • C. Then the RNA was pipetted into a folding solution.The first condition tested was the 50 mM sodium cacodylate buffer, which was increased ten times in volume to accommodate enough for an accurate reading from the pH probe.For example, the original method volume of folding buffer was 16.5 μl of the desired sodium cacodylate concentration along with 1 μl of 250 mM MgCl 2 .In this protocol, 165 μl of 76 mM sodium cacodylate was mixed with 10 μl 250 mM of MgCl 2 to reach a final concentration of 50 mM sodium cacodylate and 10 mM MgCl 2. The second condition tested was the 300 mM sodium cacodylate buffer.To create this buffer, 188 μl of 400 mM sodium cacodylate was mixed with 10 μl of 250 mM MgCl 2 to reach a final concentration of 300 mM sodium cacodylate and 10 mM MgCl 2 .To keep the final reaction volume the same and account for the 23 μl increase in the volume of sodium cacodylate in the 300mM buffer solution, the RNA for this protocol was added to only 27 μl of RNase-Free water.The RNA was then allowed to fold for 30 min in this solution at room temperature.While the RNA was folding, the pH probe was calibrated using three separate pH solutions: pH 4.01, pH 7.00 and pH 10.01 (Ther-moFisher cat.#15-200-909).During this time, a DMS solution was prepared in the fume hood by adding 15 μl of DMS to 85 μl of 100% ethanol.After the RNA had folded for 30 min, the pH probe was inserted into the RNA buffer solution.Once a stable pH was recorded, 25 μl of the DMS solution was added to the RNA buffer solution and the corresponding pH value was measured every 30 s. 12 total pH values were recorded over the 6 min DMS was modifying the RNA.The pH probe was then removed from the solution, rinsed with RNase-Free water, and blotted dry.Both conditions were tested in the same manner.

Performing solvent accessibility calculations
We computed solvent accessibility of the N1 position of A4, A5 and A8 over the three structures representing our three states.We used PDBs 1TLR, 1GID (removed GAAA tetraloop), and 1GID to represent our three states.We used freesasa ( https:// freesasa.github.io/python/ ), using the Lee Richards model with a probe radius of 1.0 Å.

Generating 3D model for TLR mutants with F ARF AR2
Hypothesizing that their binding orientation would closely resemble that of the wild type, given their similar thermodynamic profiles.In all models, we used the orientation from the wild-type to constrain the binding mode for each TLR mutation.This was performed by supplying the original 1C-12G pair and the G-C pair that closes the GAAA tetraloop from the crystal structure of the P4-P6 domain of the tetrahy-mena ribozyme (PDB: 1GID).This was performed using the rna_denovo command in Rosetta, which runs F ARF AR2.
rna_denovo -fasta test.fasta-secstruct_file test.secstruct_file-s start.pdb-minimize_rna true -nstruct 1000 Where the -fasta specifies a FASTA formatted file containing the sequences of the three strands.Like the example below.
> loop ggaaac > receptor_1 gauaugg > receptor_2 ccuaaaguc -secstruct_file specifies a secondary structure file that contains the dot bracket of the secondary structure matched with the primary sequence found in the FASTA file.An example is shown below.

Developing a quantitative DMS-MaPseq protocol to observe Mg 2+ -induced tertiary contact formation
To probe the thermodynamics of tertiary contact formation, we utilized a minimal tetraloop / tetraloop receptor (miniTTR) nanostructure as a model system (Figure 1 A).We used DMS-MaPseq to probe the structure of the miniTTR construct (Figure 1 B) (26)(27)(28).The tetraloop and tetraloop-receptor have near-zero mutation fraction (DMS reactivity) in 10 mM Mg 2+ , indicating that contact formation shields the N1 position of the four interacting adenines from methylation, consistent with previous DMS probing results (Figure 1 B, purple circles) ( 4 ).To observe the Mg 2+ -induced formation of the TL / TLR, we performed the standard DMS-MaPseq protocol with and without 10 mM Mg 2+ .We observed no significant difference in reactivity between the conditions (Figure 2 A).We postulated that the ionic strength (300 mM sodium cacodylate) in the standard DMS-MaPseq experiment enabled sodium to provide sufficient electrostatic screening without divalent counter ions, a previously observed effect ( 36 ,37 ).
To determine the effect of [Na + ] on the reactivity of the miniTTR construct, we performed DMS-MaPseq at six sodium cacodylate concentrations (300, 250, 200, 150, 100 and 50 mM) (Figure 2 , Supplementary Figure S1 ) with and without 40 mM Mg 2+ .The reactivity difference in the TL / TLR between the 0 and 40 mM Mg 2+ increased as the amount of [Na + ] decreased.The 50 mM sodium cacodylate condition demonstrated the largest effect (Figure 2 D).Moreover, the 0 mM Mg 2+ condition mirrors the reactivity pro- file of two negative control constructs that cannot form the tertiary contact ( Supplementary Figures S2 and S3 ).Lastly, a high correlation ( R2 = 0.96) was observed between the reactivity of the 300 and 50 mM sodium cacodylate conditions with 40 mM Mg 2+ (Figure 2 C).This correlation indicates that the decrease in sodium cacodylate does not cause observable changes in structure.It should be noted there is a larger pH change with the 50 mM condition over the course of the experiment, pH = 0.81, compared to the 300 mM condition ( pH = 0.25, Supplementary Figure S4 ).For this system, this pH change did not appear to affect the results.Thus, we concluded that 50 mM sodium cacodylate is the optimal buffer condition for observing Mg 2+ -induced tertiary contact formation by DMS-MaPseq.

DMS-MaPseq interrogates the behavior of individual nucleotides during TL / TLR formation
We performed DMS-MaPseq at 16 Mg 2+ concentrations to investigate how each adenine's DMS reactivity changes in the TL / TLR contact of the miniTTR scaffold.We engineered two control variants to isolate reactivity changes attributable to tertiary contact formation.The first control variant was a TL-knockout, which substitutes the GAAA tetraloop with a UUCG incapable of forming the TL / TLR contact.The second control was a TLR-knockout, where the TLR is replaced by a helix ( Supplementary Figure S2 ).First, we examined the reactivity change as a function of Mg 2+ concentration for the three adenines in the tetraloop (Figure 3 B).All three tetraloop adenines behave similarly, so we averaged their reactivity ( Supplementary Figure S5 ).We observe a rapid decrease in their average reactivity approaching 0 as Mg 2+ concentration increases (Figure 3 C, closed circles), indicative of TL / TLR formation.In contrast, the reactivity of tetraloop adenines within the TLR-knockout remains constant as Mg 2+ concentration increases (Figure 3 C, open circles).
Next, we analyzed the reactivity trends for each adenine in the tetraloop-receptor as a function of Mg 2+ concentration.A8 forms an A-A pair with the tetraloop that involves its N1 position; thus, it behaves similarly to the tetraloop's three adenines with a rapid reactivity decrease as Mg 2+ con-centration increases (Figure 3 F, closed circles).However, in the TL-knockout, A8 reactivity rapidly increases, followed by a plateau as a Mg 2+ concentration increases.(Figure 3 F, open circles).The rapid increase in reactivity suggests a conformational change induced by Mg 2+ .We observed a similar behavior of the A4 reactivity profile of the TLR as a function of Mg 2+ concentration.In the presence of the GAAA, A4's reactivity rapidly decreases, and in the TL-knockout, it increases and then plateaus.A4 does not form direct contact with the tetraloop but does have N1 positions facing out into the solvent if it retains the same conformation without the tetraloop present (Figure 3 D).A5, which forms the A platform under A8, undergoes a slight decrease in reactivity as a function of Mg 2+ concentration with the GAAA tetraloop present but not to the same magnitude as A4 and A8.Furthermore, A5's reactivity does not sharply increase with Mg 2+ concentration with the TL-knockout, suggesting its N1 atom's local environment does not significantly change with Mg 2+ concentration.
Based on the rapid increases in reactivity of A4 and A8 as a function of Mg 2+ concentration in the TL-knockout, we hypothesized that the TLR may undergo an Mg 2+ -induced conformational change independent of tetraloop binding.We examined the only apo structure of the tetraloop receptor in the absence of its tetraloop binding partner (Figure 3 G) ( 38 ).This structure was resolved without Mg 2+ and does not have the A-A platform.This is consistent with previous research demonstrating that the A-A platform does not form in low ionic strength without divalent ions ( 39 ,40 ).Thus, the rapid increase of reactivity of A4 and A8 is that upon adding Mg 2+ , we are observing the formation of an A-A platform, which forces both A4 and A8 into conformations that are much more solvent-exposed.These data are consistent with an A-A platform upon adding Mg 2+ ions but are not definitive and require additional experiments.

Measuring the effects of structural mutations destabilizing the TL / TLR interaction using qMaPseq
To measure the stability of the miniTTR TL / TLR tertiary contact, we measured the [Mg 2+ ] 1 from our 16-point Mg 2+ titration.We denote using DMS-MaPseq to measure thermodynamic parameters as quantitative DMS-MaPseq or (qMaPseq).This value is similar to our previous study with different buffer conditions (0.12 ± 0.03 mM) ( 4 ).To assess the sensitivity of qMaPseq, we generated three destabilizing mutations, each designed by extending one of the helices linking the tetraloop and its receptor (Figure 4 , Supplementary Figure S6 ).We hypothesize that these helix insertions will disrupt the precise alignment necessary for docking between the tetraloop and tetraloop-receptor.We performed a 16-point Mg 2+ titration with qMaPseq and determined observed a significantly higher [Mg 2+ ] 1 2 for each mutant of 1.10 ± 0.25, 1.37 ± 0.17 and 0.58 ± 0.03 mM for the helix H1, H2 and H3 insertions, respectively ( Supplementary Figures S7 -S9 ).These elevated [Mg 2+ ] 1 2 values indicate a reduced frequency of tertiary contact formation in the mutants compared to the wild-type scaffold.The wild-type construct has an average GAAA reactivity of 0.002 at 40 mM Mg 2+ .
The average GAAA reactivity at 40 mM Mg 2+ increases for each destabilized mutant and is linearly correlated with their [Mg 2+ ] 1 2 values ( Supplementary Figure S10 ).This indicates that at the 40 mM Mg 2+ , these mutants have their tertiary contact formed less frequently than the wild-type.
An intriguing aspect of these findings is the differential impact of the mutations.Insertions in helices H1 and H2 resulted in greater destabilization compared to H3.When their [Mg 2+ ] 1 2 values were converted to G, using the wild-type as a reference, H2 was destabilized by 1.13 kcal / mol, while H3 exhibited a destabilization of 0.54 kcal / mol.To elucidate why H3 maintains better compensation than H1 and H2, we analyzed the reactivity changes of each motif (3 × 3, IRES, and kink-turn) as a function of Mg 2+ .Overall, the reactivity changes of H3 in response to Mg2 + were more consistent with the wild-type than those of H1 or H2 ( Supplementary Figures S11 -S13 ).This trend was observed across all three motifs.In the 3 × 3 motif, A3 displayed significantly higher reactivity for insertions in H1 and H2 ( Supplementary Figure S11 ).In the IRES motif, A2 showed increased reactivity for insertions in H1 and H2 ( Supplementary Figure S13 ).Finally, in the kink-turn motif, C13 exhibited lower reactivity for insertions in H1 and H2 compared to the wild-type and H3 insertions ( Supplementary Figure S12 ).In each case, the extent of reactivity deviation from the wild-type was proportional to the observed destabilization, with H2 showing the greatest changes and H3 the smallest.These results demonstrate that qMaPseq can detect subtle destabilizations throughout the miniTTR scaffold, providing insights into the alignment of the TL / TLR contact.
qMaPseq can measure the G of TL / TLR mutants in a high throughput manner In a recent set of studies, Herschlag et al. characterized the binding affinities of approximately 3000 tetraloop receptor mutants to the GAAA tetraloop using the RNA-MaP platform ( 1 ,23 ).They identified a subset of mutations with the same binding mode as the wild-type tetraloop-receptor (11ntR).This subset presents a unique opportunity to compare the [Mg 2+ ] 1 2 of qMaPseq to existing thermodynamic measure-ments.We selected a representative sample of 98 mutants from this subset for our initial proof-of-concept experiment.The range of these binding affinities in this set to the GAAA tetraloop receptor ranges from -11.77 to -8.35 kcal / mol ( Supplementary Table S1 ).Each mutant sequence was incorporated into the miniTTR scaffold, replacing the standard 11ntR receptor.Our naming convention uses the TLR mutant sequence.For example, a mutation such as A2C has a sequence (C A U AA&U AUGG); thus, we name the miniTTR construct with this mutated TLR: C A U AA_U AUGG (Figure 5 A).We performed our standard 16-point [Mg 2+ ] titration protocol on this set of mutated TLR constructs.We determined the [Mg 2+ ] 1 2 values for 84 of the 98 mutants.Of the remaining 14, 11 were below the detection range of qMaPseq, exhibiting [Mg 2+ ] 1 2 values well beyond our testing range of 40 mM.These mutants corresponded to the lowest G values in our dataset, ranging from -9.50 to -8.30 kcal / mol, as outlined in Supplementary Table S1 .The other three constructs excluded were for issues with the DMS reactivity.In UCU AAA_C AUGA and CCU AC A_U ACGG, we observed significant reactivity in the G = C closing pair to the GAAA tetraloop, affecting its binding behavior ( Supplementary Figure S14 ).Finally, in CUU AAC_U AUGG, its GAAA tetraloop did not reduce reactivity as expected but changed into a new unknown reactivity pattern, suggesting an alternative structure ( Supplementary Figure S15 ).Our analysis revealed a strong correlation between the natural log of [Mg 2+ ] 1 2 values obtained from qMaPseq and the G values from RNA-MaP, achieving an R 2 of 0.64 (Figure 5 C).These results indicate that qMaPseq produces comparable thermodynamic measurements to the established RNA-MaP experiments.
We further noted that the average reactivity of the three As in the GAAA tetraloop at multiple [Mg 2+ ] concentrations was correlated with the [Mg 2+ ] 1 2 values.For example, at 7.5 mM Mg 2+ , simply taking the average reactivity of the three As in the GAAA tetraloop gives a coefficient of determination of 0.77 compared to the [Mg 2+ ] 1 2 values ( Supplementary Figure S16 ).Thus, we compared the natural log of the average reactivity of three adenines in the GAAA tetraloop at 7.5 mM [Mg 2+ ] to the G measured by RNA-MaP (Figure 6 D).Surprisingly, we achieved a higher correlation coefficient of R 2 = 0.68.This suggests that DMS reactivity values contain significant information content and are directly correlated to thermodynamic parameters.This correlation does not only exist at 7.5 mM Mg 2+ to the G but at many unique Mg 2+ concentrations ( Supplementary Figure S17 ).These results underscore the predictive power of a single DMS reactivity measurement, simplifying the experimental protocol and enhancing its accessibility.
To investigate general trends in nucleotide behavior over all variants, we performed hierarchical clustering ( Supplementary Figure S18 ) on the three adenines in the tetraloop, as well as A4, A5 and A8 in the tetraloop receptor over the entire Mg 2+ titration.This analysis revealed eight distinct clusters of behavior ( Supplementary Figure S19 ), characterized not by [Mg 2+ ] 1 2 but by changes in nucleotide reactivity as a function of Mg 2+ concentration.For instance, cluster 1 (comprising 5 variants) consistently features a C5A mutation and either an A or C at position 2 instead of a U.These mutations result in two notable effects: first, C5 exhibits significantly higher reactivity compared to A5 in other variants and the wild-type; second, A8 shows markedly reduced reactivity regardless of Mg 2+ levels, suggesting a novel conformation.Cluster 2 (encompassing 7 variants) involves the mutation of the G-U pair to either an A-C or C-A pair, which increases the reactivity of A8 at low Mg 2+ concentrations compared to other variants.Finally, cluster 4 (comprising 4 variants) includes constructs with atypical reactivity patterns, characterized by increased reactivity upon the addition of Mg 2+ , indicating the adoption of different conformational states.These variants were excluded from previous analyses due to their divergent behaviors.

qMaPseq identifies residue-specific structural features on tetraloop receptor mutants
In addition to the accessibility and simplicity, qMaPseq offers nucleotide-level insights into how mutations alter the 3D structure of the tetraloop receptor.We examined three common TLR mutations that yielded unique DMS reactivity profiles.For these TLR mutants, we generated potential 3D models utilizing Fragment Assembly of RNA with Full Atom Refinement (F ARF AR) ( 41 ,42 ).Previously, we used F ARF AR to build a 3D model of the 12-ntR receptor that could successfully explain the effects of known mutations ( 22 ,43 ).
First, we examined a G5C / U6C variant, which mutates the G-U pair to a C-C pair (Figure 6 A).This mutation occurs 31 times ( Supplementary Table S2 ).CCUAA C _ C AUGG is the only variant that contains this G-U to C-C mutation.This mutation produces a striking reactivity pattern change with and without 40 mM Mg 2+ .At 0 mM Mg 2+ , C5 and C6 have high and comparable reactivities of 0.021 and 0.032, respectively.However, at 40 mM Mg 2+ , C5's reactivity drops to 0.009, and C6's reactivity increases to 0.064.C6's reactivity is 7-fold higher than C5's, indicating a significant difference in the chemical environments of their N1 atoms (Figure 6 A).To understand this radial asymmetry in reactivity in this C-C, we examined the ten lowest energy structures generated by F ARAF AR (Figure 6 B).These models converged on a potential explanation.C5's N1 forms a hydrogen bond to one of the hydrogens on C6's N4 (Figure 6 C).This hydrogen bond likely shields C5's N1 from methylation and holds C6's N1 pointed out into solvent, leading to higher methylation levels.This unique asymmetric C-C pair is also in the miniTTR scaffold's kink-turn and yields a similar reactivity profile (Figure 6 D and Supplementary Figure S20 ).
Next, we examined a C9U mutation.Figure 6 E).This U is flipped out and does not form any contact with the tetraloop in the crystal structure of the TL / TLR complex.We examined the construct CCU AAG_U A C GG, which contains only this C9U mutation, where we hypothesize the C will also be flipped out.The reactivity at this C decreases rapidly at low [Mg 2+ ] levels, consistent with an initial conformation change induced by the Mg 2+ (Figure 6 J).The top 10 models converge highly for this sequence, and all have C9 flipped out-the reactivity at higher Mg 2+ levels is about 0.015, similar to other flipped Cs ( Supplementary Figure S21 ).
Finally, we investigated an intriguing insertion mutation, where an additional adenine is added at position 4, resulting in a sequence of three consecutive adenines (CCU A AA G_UA UGG).This modification poses questions about the role of the newly inserted A in the A-A platform for-mation.Comparing the reactivity of A4 and A5 in this mutant with the wild-type shows similar behaviors, suggesting they perform analogous roles.However, the inserted A exhibits a unique reactivity profile (Figure 6 J), not aligning entirely with the behavior of A4 or A5.This reactivity pattern resembles the C9U mutation in CCU AAG_U A C GG, leading us to hypothesize that this A is flipped out and does not directly participate in tetraloop binding.One of the F ARF AR models supports this hypothesis (Figure 6 I).However, it should be noted that there are lower energy models that have A5 flipped out, but that does not agree with our experimental data.These findings indicate that, unlike previously high-throughput methods, the nucleotide-level data generated by qMaPseq can provide details into RNA's 3D structure.

Discussion
In this study, we introduced qMaPseq, a novel chemical mapping technique designed to interrogate tertiary contact in a rapid, high-throughput fashion.A significant advantage of qMaPseq over prior methods is its accessibility: it does not require specialized equipment, extensive training, or sequence design.This makes it highly applicable to various tertiary contact systems that depend on Mg 2+ , such as pseudoknots and kissing loops, vital in the genomes of viruses like HIV-1 and S AR S-CoV-2 ( 26 ,32 ).qMaPseq is also agnostic to which chemical modifier is used and can easily be adapted to SHAPE, NAz, EDC or any other newly discovered modifier.
DMS-MaPseq has been demonstrated to work on complete viral genomes, eliminating size as a potential hindrance.Furthermore, there is no limit on how many sequences can be done in a single library.If it's possible to differentiate each unique sequence in principle, one can examine millions of unique sequences in a single reaction.This is achievable with current technology, as oligo pools of this size are commercially available, and sequencing continues to decrease in price.This scalability, coupled with decreasing sequencing costs and the burgeoning field of machine learning, opens up exciting possibilities for exploring the fundamental rules of tertiary contact formation at an unprecedented scale.
Our observation that the natural logarithm of the GAAA tetraloop's reactivity correlates more closely with the G of TLR mutants than the [Mg 2+ ] 1 2 was unexpected.This finding simplifies the data collection, requiring fewer data points and revealing that DMS reactivity values contain more nuanced information than the traditional binary high / low reactivity interpretation.This marks the first time single DMS reactivity measurements have been directly correlated with structural thermodynamics beyond the Watson-Crick base pairing.Previous research has indicated that high reactivity often signals a contact break, and low reactivity indicates formation (44)(45)(46).Our findings add a new quantitative fundamental ( G ) dimension.This methodology is readily extended to other noncanonical interactions.
In summary, qMaPseq emerges as a powerful tool for probing the 3D structure and stability of RNA, paving the way for future studies in RNA biology.Its capability to provide detailed, high-throughput analysis holds significant promise for advancing our understanding of RNA structure-function dynamics and potentially contributing to developing new therapeutic strategies targeting RNA in various diseases.

Figure 1 .
Figure 1.Ov ervie w of minimal tetraloop / tetraloop receptor construct.( A ) T he secondary str uct ure and tertiary str uct ure (PDB: 6DVK) are colored by secondary str uct ure motifs, helices are in black.( B ) The secondar y and tertiar y str uct ures are colored by DMS reactivity, indicating the tetraloop / tetraloop receptor is formed.The 4 adenines that can serve as probes of tertiary contact formation are highlighted in magenta.

Figure 2 .
Figure 2. Buffer effects on the DMS reaction and tertiary contact stability.( A ) Using 300 mM sodium cacodylate, compare 40 mM Mg 2+ to 0 mM Mg 2+ with no notable difference in the TL / TLR mutational fraction.( B ) A comparison between 40 mM Mg 2+ and 0 mM Mg 2+ with 50 mM sodium cacodylate sho w ed notable differences between the mutation fraction in the TL / TLR (highlighted in red), indicating tertiary contact formation.( C ) The high correlation between DMS mutation fraction at 50 mM and 300 mM sodium cacodylate at 10 mM Mg 2+ indicates that the lo w er buffer is sufficient to buffer the reaction and gives near identical data.( D ) The change in mutation fraction of the GAAA tetraloop between 0 and 40 mM Mg 2+ displaying increasing changes as the buffer concentration increases.

Figure 3 .
Figure 3.Nucleotide le v el conf ormational inf ormation as a function of Mg 2+ concentration.( A ) 3D str uct ure of the TL / TLR interaction.Colored by residue, red is the three adenines in the tetraloop, green is A4, orange is A5, and blue is A8 of the tetraloop receptor.(A) the 3D str uct ure of the TL / TLR interaction.( B ) The secondary str uct ure of the TL / TLR interaction.( C ) The mutation fraction of the average of the three adenines of tetraloop in filled circles.The mutation fraction of a TLR-knockout in open circles is missing the TLR ( Supplementary Figure S7 ) and thus cannot form the contact.( D ) The mutation fraction of A4 of the TLR in the wild-type (closed circles) and the TL-knockout (open circles).( E ) The mutation fraction of A5 of the TLR in the wild-type (closed circles) and the TL-knockout (open circles).( F ) The mutation fraction of A8 of the TLR in the wild-type (closed circles) and the TL-knockout (open circles).( G ) The proposed three state model of the TLR, A4 (green), A5 (orange), A8 (blue).

Figure 4 .
Figure 4. qMaPseq can measure the destabilization of TL / TLR due to helix lengthening.( A ) For a schematic of which helices will be lengthened, see Supplementary Figure S11 for secondary str uct ures of each insertion construct compared to the wild-type.( B ) Left is a comparison between the mutation fraction of the wild-type and the 3-bp helix 1 (H1) insertion construct.Middle, a comparison between the mutation fraction of the wild-type and the 3-bp helix 2 (H2) insertion construct.Right, a comparison between the mutation fraction of the wild-type and the 3-bp helix 3 (H3) insertion construct.

Figure 5 .
Figure 5.Comparison between qMaPseq results and known G values for 98 TLR mutations.( A ) A schematic of how mutant TLRs were inserted into the miniTTR scaffold.( B ) Comparison between Mg 2+ titrations of the wild-type and two TLR mutants.( C ) Correlation plot between the natural log of [Mg 2+ ] 1 2 and the G measured by the RNA-MaP experiments from Bonilla et al. for each TLR mutation.( D ) Correlation plot between the natural log of a v erage DMS reactivity of the three adenines compared to the G measured by the RNA-MaP experiments from Bonilla et al. for each TLR mutation.

Figure 6 .
Figure 6.DMS reactivity signatures indicate 3D structure features in TLR mutants.( A ) DMS reactivity of CCUAAC_CAUGG at 40 mM Mg 2+ displaying the unique DMS signature of the C-C mutation highlighted with arrows.( B ) F ARF AR generated a model of CCUAAC_CAUGG TLR mutant, and the C-C mismatch is bo x ed. ( C ) The F ARF AR model of the C-C mismatch, which has a unique asymmetrical orientation, with the right C turned to form a h y drogen bond with the N1 position of the left C. ( D ) The C-C mismatch from the kink-turn motif in the miniTTR scaffold (PDB: 6DVK) has a unique asymmetrical orientation, with the right C turned to form a hydrogen bond with the N1 position of the left C.This asymmetry is likely the cause of its sk e w ed DMS reactivity, where the right C is o v er 10-f old more reactiv e than the left C. ( E ) DMS reactivity of CCU AAG_U ACGG with 40 mM Mg 2+ with a C9U mutation.( F ) In the F ARAF AR model of the C9U mutation, this C is flipped out like the wild-type U. ( G ) The DMS reactivity as a function of Mg2 + where the C le v els out to reactivity indicating is consistently solvent accessible.( H ) DMS reactivity of CCU AAAG_U AUGG with 40 mM Mg 2+ with an A insertion.( I ) In the F ARAF AR model of the CCU AAAG_U AUGG, the inserted A is flipped out and does not participate in the A-A platform.( J ) The DMS reactivity as a function of Mg 2+ where the inserted A levels out to reactivity, indicating that it is consistently solvent accessible and has a unique profile that differs from the As in the A-A platform.